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Representing molecule-surface interactions with symmetry-adapted neural networks 



O 

o 

(N 
X) 

PLh 

(N 
(N 



Jorg Behler, Sonke Lorenz, and Karsten Reuter 
Fritz-Haber-Institut der Max- Planck- Ges ells chaft, Faradayweg 4-6, D-14195 Berlin, Germany 

(Dated: February 6, 2008) 

The accurate description of molecule-surface interactions requires a detailed knowledge of the 
underlying potential-energy surface (PES). Recently, neural networks (NNs) have been shown to be 
an efficient technique to accurately interpolate the PES information provided for a set of molecular 
configurations, e.g. by first-principles calculations. Here, we further develop this approach by 
building the NN on a new type of symmetry functions, which allows to take the symmetry of 
the surface exactly into account. The accuracy and efficiency of such symmetry-adapted NNs is 
illustrated by the application to a six-dimensional PES describing the interaction of oxygen molecules 
with the Al(lll) surface. 

PACS numbers: 68.35. Ja,82.20.Kh,02.60.Ed 
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I. INTRODUCTION 



Molecule-surface interactions play a cental role in 
many technologically relevant processes like heteroge- 
neous catalysis, semiconductor growth and corrosion. A 
detailed investigation of the underlying elementary steps 
at an atomic scale, e.g. physisorption, chemisorption, dis- 
sociation, diffusion, and desorption, is crucial to gain the 
deeper understanding needed to identify new catalysts, 
candidates for protective surface coatings, and chemically 
inert surfaces. A central quantity in all these processes 
is the potential-energy surface (PES), which gives the 
potential-energy of the system as a function of the posi- 
tions of the nuclei. First-principles calculations, and in 
particular density- functional theory (DFT) , have become 
an important tool in providing accurate, often quantita- 
tive information on PESs. Nevertheless, the computa- 
tional demands connected with the calculation of every 
single PES point for systems involving an extended solid 
surface still impose severe constraints, e.g. limiting the 
number and time span of "on-the-fly" ab initio molecu- 
lar dynamics (MD) trajectories to study the molecule- 
surface interaction. Consequently, important physical 
quantities like sticking coefficients, which result from sta- 
tistical averaging and which thus require e.g. a large 
number of MD runs to obtain converged results, are still 
hardly accessible in this way. 

As one possibility to determine such quantities from 
first-principles, GroB and Scheffler— introduced a "divide 
and conquer" approach, which allows to speed up the 
calculation of the MD trajectories by splitting the prob- 
lem into three steps. First, the multi-dimensional PES 
is mapped on a finite grid by calculating the energies 
for a number of configurations using an accurate method 
like DFT. In a second step these points are interpolated 
to a continuous PES representation by an appropriate 
method that allows to access the energy and the forces 
several orders of magnitude faster than the original DFT 
calculations. In the mentioned example of MD simula- 
tions, the forces at any nuclear configuration can then 



be obtained from the interpolated PES representation at 
low computational cost, thereby permitting the calcula- 
tion of a sufficient number of MD trajectories. 

Several approaches have been proposed for such in- 
terpolation schemes. In analytical fits^i^i^i^ a reasonable 
functional form is "guessed" by physical intuition and the 
free parameters of this functional form are optimized to 
represent the set of DFT energies as accurately as pos- 
sible. If an appropriate functional form can be found, 
this approach minimizes the required first-principles in- 
put and avoids spurious, unphysical features in the PES. 
On the other hand, with increasing dimensionality of 
the PES, the construction of suitable functional forms 
becomes increasingly involved, and the fixed functional 
form makes this approach also quite infiexible. First at- 
tempts to optimize the functional form itself in the fit- 
ting process by applying a genetic programming tech- 
nique have hitherto only been reported for up to three- 
dimensional PESsS,. An alternative to analytic fits is the 
modified Shepard interpolation^i^. In this method the 
potential close to a calculated DFT point is expanded 
as a second-order Taylor series. The potential of a new 
configuration is then constructed as a weighted sum over 
the Taylor expansions with respect to the neighboring 
DFT points. The fitting procedure can be further sim- 
plified by reducing the corrugation of the PES — liSiii. 
For PESs describing a molecule-surface interaction this 
can e.g. be achieved by subtracting separately calcu- 
lated PESs describing the interactions of the individual 
atoms of the molecule with the surface. Finally, we also 
mention an approach to efficiently represent PESs by pa- 
rameterizing a tight-binding Hamiltonian^^. In principle, 
this approach requires only a relatively small number of 
DFT calculations in order to obtain accurate fits to high- 
dimensional PESs. In practice, the scheme is, however, 
unfortunately hampered by the indirect and cumbersome 
tight-binding parameterization procedure. 

In recent years, neural networks^- (NN) were also 
successfully applied to fit PESs of small gas-phase 
molecule a^"^'^^i^^'^^i^^'^^i^°i^^ . Neural networks form a 
very general class of functions^^i22,, which in princi- 



pie can approximate any function to arbitrary accu- 
racy without requiring any information about the un- 
derlying functional form of the problem. First ap- 
plications to the description of molecule-surface inter- 
actions have demonstrated the capabilities of this ap- 
proach2ii2^i2£i22^ but were complicated by technical difh- 
culties in achieving a proper consideration of the symme- 
try of the solid surface^. The aim of the present paper 
is therefore to overcome these limitations by introduc- 
ing a symmetry-adapted neural network representation 
of PESs for molecule-surface interactions, which is based 
on a new type of symmetry functions that takes the sym- 
metry of the surface potential exactly into account. The 
accuracy and efficiency of such a NN representation, in 
particular for MD simulations, is demonstrated by study- 
ing the interaction of oxygen with the Al(lll) surface. 
For this, we concentrate on the six-dimensional PES de- 
scribing the interaction with an O2 molecule constrained 
to its spin-triplet state, which could recently be shown to 
be of crucial importance in understanding the low initial 
sticking coefficient reported experimentallj^i^. 



II. NEURAL NETWORKS 

Inspired by the neural signal processing in biological 
systems^, the first artificial neural networks have been 
introduced in 1959^. Since then NN techniques have be- 
come a standard tool in many fields of research. While 
they have been applied mainly to pattern recognition and 
classification problems, neural networks also represent a 
very general fitting scheme that in principle allows to 
approximate any function to arbitrary accuracy^'^'^. No 
previous knowledge about the underlying functional form 
is required. Instead, a number of known values of the 
function to be fitted is presented to the NN in order to 
adapt a rather large number of parameters using an op- 
timization algorithm. Out of the many types of neural 
networks, the class of multilayer feed-forward neural net- 
works'^ has particularly proven to be a useful tool for 
the representation of potential-energy surfaces^ii^. 

The general structure of such a multi-layer feed- 
forward neural network is shown schematically in Fig. [1] 
It consists of an input layer, one or more hidden layers 
and an output layer. In each layer there is a certain 
number of nodes. When representing a PES by a NN, 
the relevant coordinates of the system determining the 
potential are fed into the input layer of the NN. There 
are many possible choices for the input coordinates 
depending on the system to be studied. In the most 
simple case of the one-dimensional interaction potential 
of an isolated diatomic molecule the bond length would 
e.g. be an appropriate choice as input coordinate. For 
molecules several bond lengths and angles can be used as 
input, as has been shown for the PESs of several small 
moleculesi^ii^. The node in the output layer provides 
the target quantity, i.e., in the present case the potential 
energy of the system. However, NNs are not constrained 
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FIG. 1: Schematic structure of a feed- forward neural network 
(NN). Two coordinates {C^} defining the molecular configu- 
ration are provided to the NN in the nodes of the input layer. 
The total energy E of the structure calculated by the NN is 
given in the node of the output layer. Between the input and 
the output layer is a hidden layer with three nodes. All nodes 
are connected to the nodes in the adjacent layers by weight 
parameters Wij that are optimized by fitting the output en- 
ergies to a provided training set of known input energies, e.g. 
from density-functional theory calculations. The bias node 
acts as an adjustable offset for nonlinear activation functions 
applied at each node in the hidden and the output layer (cf. 
text). 

to fit only one quantity, and it would for example also be 
possible to fit the potential and the forces acting on the 
atoms simultaneously. 

In between the input and the output layer, there are 
one or more hidden layers, each with a certain number 
of nodes. The term "hidden layer" indicates that the nu- 
merical values at the nodes of these layers have no phys- 
ical meaning and are just auxiliary mathematical quan- 
tities. Each node in each layer is connected to the nodes 
in the adjacent layers by so-called weights, the fitting pa- 
rameters of the NN. As illustrated in Fig. [U the weight 
parameter w^^ is connecting node j in layer k with node 
i in layer fc — 1, where the input layer has the superscript 
k — Q. The value of node j in layer k is obtained from 

the values y^~^ of all nodes i in the preceding layer k — 1 
and from the connecting weights by 

2/' = /^(^< + E4yf"') ' (1) 

i.e., for each node the values of the nodes in the previ- 
ous layer are multiplied by the respective weights and 
added up to yield a single number. On this number a 
so-called activation function is applied, which can be 
different for each layer. Activation functions are typically 
sigmoidally shaped non- linear functions, which introduce 
the capability to fit non-linear functions into the NN. 
Frequently employed functional forms are the hyperbolic 
tangent or Fermi-like functions. For very large or very 
small arguments the activation functions converge to a 



3 



constant number, but for a certain interval the output 
changes significantly in a non- linear way. In contrast, a 
linear activation function is often employed in the output 
layer to avoid a constraint on the possible range of output 
values. Finally, there is a bias weight connected to each 
node in the hidden layers and the output layer, which acts 
as an adjustable offset to shift the non-linearity regime of 
the activation functions as needed to obtain an optimal 
fit. 

The full analytic form for the small model NN shown 
in Fig. [T] is given by the expression 

men) = fa (wi, + ^ + E j 

(2) 

In general, each layer including input and output layers 
can contain many more nodes than in the simple exam- 
ple shown in Fig. [1] and also more than one hidden layer 
is typically used. The number of layers and nodes de- 
termines the analytic form of the NN, and also analytic 
derivatives, i.e., the forces, can be calculated. The over- 
all NN architecture can then be described following the 
scheme suggested in Ref. [13. In this scheme, the NN in 
Fig.[T]is a {2 — 3 — 1 tl} network, where the first number 
indicates the number of nodes in the input layer, the last 
number the number of nodes in the output layer, and the 
number(s) in between the number(s) of nodes in the hid- 
den layer (s). The employed activation function in each 
layer is labeled as t, if a hyperbolic tangent is applied, 
or as I for a linear function. In the present example, a 
hyperbolic tangent is thus used in the hidden layer, and 
a linear function in the output layer, hence tl. 

In order to construct a continuous PES representation 
from a number of known training energies (for example 
obtained from DFT), the weight parameters of the NN 
are optimized in an iterative way to reproduce these input 
energies as accurately as possible. Initially the weights 
are chosen randomly, and for each known configuration 
(i.e. point in the PES landscape) the potential predicted 
by the NN is calculated. Taking these values and the cor- 
responding original input energies, an error function can 
be constructed. This error function is then minimized 
to optimize the weight parameters and thus requires the 
calculation of the derivatives of the output energies with 
respect to each w^j. Standard optimization algorithms 
like steepest descent, which is called backpropagation in 
the NN context, or conjugate gradient can be used to find 
the optimum set of parameters. In particular for NNs 
also the extended Kalman filter— i^ii^^ (EKF) has proven 
to be a useful optimization tool. In the EKF the weights 
are adapted not after the presentation of the full input 
data set, but after the presentation of each single data 
point, which is intended to make the optimization pro- 
cess less sensitive to local minima in the high-dimensional 
optimization space.— Additionally, information from pre- 
vious weight updates is included by using a modified cost 
function including a weighting factor for previous itera- 
tions. For details on the EKF we refer to Ref. [13 and 



references therein. Once an optimal set of weight param- 
eters has been found, the NN constitutes the continuous 
PES representation, i.e. it can be employed to predict 
the energy and forces at any point in the PES landscape. 

A priori it is not known which network architecture 
will be best for a given fitting problem, and tests are 
necessary to find the optimum number of hidden layers 
and nodes, as well as the best activation functions for 
a given system. Too few nodes in the hidden layers will 
typically result in underfitting, i.e., important features of 
the PES will be erroneously smoothed out. More nodes 
increase the flexibility of the NN, but can lead to overfit- 
ting, i.e., artificial features appear in the PES. In general, 
an advisable strategy seems therefore to use the smallest 
possible NN that yields the desired accuracy. 

In order to check on overfitting, one can split the full 
input data set into a training set used to optimize the NN 
parameters, and a test data set, which is not employed in 
the fit. The set of NN parameters that yields the lowest 
error for the test set may then be interpreted as having 
the best predictive power for unknown structures. 



III. SYMMETRY-ADAPTED NEURAL 
NETWORKS 

The intention of the present work is to employ NNs 
to represent the PES describing the interaction of a 
molecule approaching an extended single-crystal surface. 
Addressing an important class of molecules, we will focus 
our discussion on diatomic molecules, and employ fur- 
thermore the so-called frozen-surface approximation. A 
possible generalization of our approach is then discussed 
in Section fVl 

In the frozen-surface approximation, the substrate 
atoms are assumed to be fixed to their positions pertinent 
to the clean surface. The dimensionality of the problem 
is thereby reduced to the degrees of freedom representing 
the impinging molecule, i.e. six dimensions for the case of 
diatomic molecules. Instead of the Cartesian coordinates 
of the two atoms, usually the center of mass coordinates 
A, and Z (where the x- and y-axes are parallel to 
the surface, and the z axis points out of plane), as well 
as the molecular bond length r, the angle between the 
molecular axis and the surface normal 9, and the angle 
between the projection of the molecular axis into the sur- 
face plane and the positive a;-axis are used to describe 
the molecular configuration. These coordinates can be 
grouped into two classes: the coordinates X, Y, 6 and (j), 
in which the potential is periodic, and the "non-periodic" 
coordinates Z and r. 

Due to the lateral symmetry of the solid surface it is 
sufficient to map the PES only for molecular configu- 
rations inside the irreducible wedge of the surface unit- 
cell. However, after the NN parameterization it is usually 
desirable to also have the appropriate PES information 
available outside this irreducible wedge. In MD simu- 
lations intended to be run on the NN represented PES, 
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the molecule should for example not be constrained to a 
motion inside the irreducible wedge only, but should be 
able to move from one wedge to another. Consequently, 
the NN has to provide the energy and forces for all pos- 
sible molecular configurations also outside the symmetry 
unique wedge. A straightforward solution would be to 
expand the energetic input data to a wider range of lat- 
eral X and Y coordinates by suitably copying the data 
inside the irreducible wedge into neighboring wedges be- 
fore the fit, and then to train the NN to a larger surface 
area. Such a procedure is necessarily very inefficient due 
to the increased input data set and larger NN size, i.e., 
number of parameters that would be required. Further- 
more, the NN fit is necessarily only an approximation to 
the underlying energetic data, and if parts of the PES, 
which are equivalent by symmetry, are fitted indepen- 
dently, the symmetry is numerically broken. 

A solution to this problem has been suggested by 
Lorenz, GroB and SchefFler by introducing functions in- 
cluding the symmetry of the surfacei^i^l Basically, the 
problem of assigning energies to molecular configurations 
is separated into two steps. In the first step the molecu- 
lar coordinates are mapped on the "symmetry functions" , 
which describe the symmetry of the surface, and thereby 
the symmetry of the PES. The resulting symmetry func- 
tion values replace the original molecular coordinates as 
input for the NN, and in a second step the energies are as- 
signed to the symmetry function values by the NN. This 
procedure ensures that the symmetry properties of the 
symmetry functions are fully included in the NN repre- 
sentation of the PES. The construction of these symme- 
try functions is similar to the construction of the func- 
tional form in an analytic fit^. Yet, the function values 
do not need to yield the total energy of the system, but 
only its symmetry in form of a set of function values. 

Unfortunately, the hitherto proposed symmetry func- 
tions are valid only for a given system and are difficult to 
construct even for simple low-dimensional PESs, because 
they have to describe correctly the complex interdepen- 
dence of all "periodic" coor dinates X, Y, 9 and 0^1^ Up 
to now, such functions have correspondingly been con- 
structed using physical intuition, which can easily lead 
to artificial or missing symmetries. Since this can have 
severe consequences on the represented PES topology, 
and thereby on molecular trajectories in MD simulations 
run on the NN interpolated PES, the present work aims 
to improve on this situation by developing a more general 
scheme to construct symmetry functions, which include 
the full symmetry of a given system and are free of spuri- 
ous symmetries not present in the true molecule-surface 
interaction. 




FIG. 2: Top view illustrating the symmetry of a fcc(lll) 
surface. The irreducible wedge of the surface unit-cell (dark 
grey triangle) is spanned by the top, fee and hep sites. The 
dotted lines represent mirror planes and the white triangles 
show the positions of threefold rotation axes perpendicular to 
the surface. The first layer substrate atoms are shown as grey 
circles, and the coordinate system illustrates the directions of 
the employed in-plane x- and y-axes. 



wedge of the surface unit-cell is shown in Fig. [2l The 
position of the atom over the surface is uniquely defined 
by its three Cartesian coordinates X, Y, and Z, where 
X and Y are in the surface plane as indicated in Fig. [21 
The lateral symmetry of the surface could be considered 
in a straightforward manner by replacing these coordi- 
nates with the distances dtop, dfcc and dhcp of the atom 
to the closest top, fee and hep surface sites, i.e., the edges 
of the irreducible wedge of the surface unit-cell. If in a 
MD simulation an atom crosses the border of the sym- 
metry unique wedge of the surface, the closest reference 
surface sites change and the set of {dtop, rffcc, dhcp} values 
naturally incorporates the symmetry as shown in Fig. [3] 
for an atomic motion from a top site via a bridge site to 
a neighboring top site. Nevertheless, although describ- 
ing the periodicity of the surface correctly, the set {dtopj 
c^fcc, c'hcp} is not an appropriate choice for the symmetry 
functions to fit the total energy of the system, since in 
MD simulations also the derivatives of the energy with 
respect to the atomic coordinates, i.e., the forces, are re- 
quired. As can be seen in Fig. [3] for the sample atomic 
motion, dtop shows a discontinuity in its derivative at the 
wedge boundary originating from the switch to another 
reference top site, and consequently this discontinuity is 
also present in the forces. 



A. Interaction of an atom with a fcc(lll) surface 

We begin developing our concept by looking at the 
symmetry of the interaction of an atom with a fcc(lll) 
surface. A schematic top view explaining the irreducible 



We solve this problem of discontinuities in the deriva- 
tives by replacing the atomic distances by Fourier terms 
centered at the high-symmetry sites, which describe the 
position of the atom in a unique way as well. For a 
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FIG. 3: Values of the atomic distances dtop, dfcc and dhcp 
of an atom moving along the a;-axis from the top site along 
the bridge site to a neighbored top site, cf. Fig. [2] At the 
bridge site dtop shows a discontinuity in the derivative, which 
is not present in the atomic Fourier terms /top, /fee and /hep 
in Eqs. P1I5)) . 



is the lattice constant of the surface unit-cell. The ex- 
ponential term exp iZ) takes the dependence on the 
vertical distance Z to the surface into account, and en- 
sures in particular that for a large atom-surface distance 
the function values and therefore also the fitted energy 
should become independent of the actual values of X and 
Y. The Fourier terms are compared to the simple dis- 
tance terms in Fig. [3) They have continuous derivatives, 
which demonstrates that this function set is now a suit- 
able choice for fitting the PES of an atom interacting 
with a fcc(lll) surface, while naturally including the full 
symmetry of the surface. This is also directly apparent 
from the full lateral periodicity of the functions /top, /fee 
and /hep that is plotted in Fig. IH 



B. Interaction of a diatomic molecule with a 
fcc(lll) surface 



fcc(lll) surface, these atomic Fourier terms are 
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where the positions of the top, fee and hep sites are given 
by {Xt,Yt), {Xf,Yf) and (Xh,lh), respectively, and a 



It is intuitive to generalize the concept of Fourier 
term based symmetry functions to molecules by build- 
ing on the Fourier terms for each constituent atom of the 
molecule. For a diatomic molecule, the positions of both 
atoms can then e.g. be described by Fourier terms like 
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FIG. 4: (Color online) Fourier terms for the interaction of an atom with a fcc(lll) surface. In (a) the top-term /top, in (b) the 
fcc-term /fee, and in (c) the hep-term /hep, defined in Eqs. ([3] -[5]) respectively, are shown as a function of x and y in units of 
the lattice constant a. In all three figures the white circles represent the position of the first layer substrate atoms. 
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which are functions of the Cartesian coordinates of the 
two atoms [Xi, Yi, Zi) and {X2, 12, ^2)- Apart from the 
constant Ci to which we return below, the only modi- 
fication compared to the atomic terms in Eqs. ([3][5|), is 
the explicit addition of the terms Zf and Z| to the re- 
spective Fourier terms of atoms 1 and 2. This is neces- 
sary, because the multiplication by the exponential term 
exp(— depending only on the center-of-mass dis- 
tance of the molecule from the surface is not sufficient 



to distinguish different separations Zi and Z2 of both 
atoms from the surface. This is resolved by the explicit 
heights Zl or Z2, and Eqs. ([51 - [TT|) then describe the 
positions of both atoms with respect to the top, fee and 
hep sites uniquely. 

What is still not defined, however, is the relative po- 
sition of both atoms to each other, since we obtain the 
same symmetry function values no matter in which wedge 
at the surface, i.e., at which distance from each other, the 
two atoms are located. This can be remedied by adding 
a further symmetry function to the set, which is sim- 
ply given by the distance r between both atoms of the 
molecule. The final set of symmetry functions for a het- 
eronuclear diatomic molecule is thus 



(11) 
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C. Incorporation of internal molecular symmetries 

The symmetry functions defined up to now allow to 
fully take the symmetry of the solid surface into account, 
but do not exploit possibly existing symmetries of the 
molecule itself. For diatomic molecules this would be the 
case for homonuclear molecules like O2, where an inter- 
change of both atoms should not change the symmetry 
function values and therewith the input and output of the 
NN. Considering this additional symmetry explicitly in 
the NN parameterization is obviously desirable. Similar 
to the consideration of the surface symmetries, it may 
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even be mandatory from a numerical point of view to 
avoid artificial symmetry breaking and noise in the fitted 
total energies and forces. 

Within the concept of Fourier term based symmetry 
functions, such internal molecular symmetries may be 
accounted for by suitably combining the atomic Fourier 
terms discussed up to now. For this, it may be conve- 
nient to avoid negative function values and this is the 
reason, why the arbitrary constant Ci has been added to 
the symmetry functions in Eqs. ([S1- [TT|) . In general, such 
a constant can be added at will, without changing any- 
thing in the symmetry properties of the Fourier term. For 
a heteronuclear diatomic molecule adding or not adding 
this constant makes no difference, and one would simply 
choose Ci = 0. However, if one wants to avoid nega- 
tive function values, a sufficiently positive value for this 
constant may equally well be chosen. 

For the example of a homonuclear diatomic molecule, 
the additional symmetry with respect to exchange of the 
two atoms can then e.g. be incorporated by symmetrizing 
and antisymmetrizing the Fourier terms of both atoms 
for each surface site yielding the new set of symmetry 
functions 
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These symmetrized and antisymmetrized atomic Fourier 
terms form "molecular" Fourier terms and still contain 
all structural information. They are plotted in Fig. [5] as 
a function of the center-of-mass coordinates X and Y for 
fixed values of r, Z, 9 and 4>, exhibiting the correct lateral 
periodicities of the sohd surface. 

The proper dependence on the angular orientation of 
the molecule is illustrated in Fig. [6] Here, the symmetry 
function values are plotted for a molecule rotating about 
(j), i.e. for fixed coordinates X, Y, Z, 9 and r. In Fig. [5^ 
the molecule is oriented parallel to the surface above an 
fee site. A rotation by 60° transfers the molecule into an 
energetically equivalent configuration. This is reflected 
in the set of symmetry function 

values, which exhibit a periodicity of 60°. If the 
molecule is not parallel, but tilted with respect to the sur- 
face, the symmetry is reduced from 6-fold to 3-fold. This 
is exemplified in Fig. for 0=30°, and the symmetry 
functions are able to describe this complex interdepen- 
dence between the two angles 9 and </> correctly. Equiv- 
alent performance of the symmetry functions is found 
for other high-symmetry sites at the surface, namely the 
top and hep sites. Equally important, the reduced sym- 
metry at other sites is also properly taken into account. 



as shown in Fig. [6j: for a molecule at a bridge site and 
tilted by 0=30°. The symmetry of the remaining mirror 
plane at the bridge site is reflected in the set of symmetry 
function values by the symmetry at 0=90° and (/)=270°. 
Due to the inequivalence of the hep and fee sites at the 
fcc(lll) surface, there is no symmetry with respect to 
0=0° and 0=180° at the bridge site. Finally, Fig. EJi 
demonstrates the capability of the symmetry functions 
for a low symmetry site at the surface, namely for a 
molecule located at y = 0.5 and Y = ^\/3. Here, the 
set of symmetry functions is different for each value of 
because of the absence of symmetry elements at this site. 



D. Adding redundant symmetry functions 



For diatomic molecules, the derived set of seven sym- 
metry functions (cither G''^ — G"'^ for heteronuclear 
molecules or G^— G*" for homonuclear molecules) contains 
already exhaustive information on the molecular config- 
uration. However, adding further (redundant) symmetry 
functions may nevertheless improve the numerical accu- 
racy of the fit. For this it is important to realize that 
the molecular coordinates are always only mapped onto 
the symmetry functions, which in turn provide the in- 
put for the NN. There is no need to ever invert this 
procedure, i.e. to reconstruct the coordinates of each 
atom in the molecule that correspond to a given set of 
symmetry function values. For the case of a diatomic 
molecule, the six atomic coordinates may therefore be 
mapped onto an arbitrarily large set of symmetry func- 
tions, if this only helps to achieve a good NN fit to the 
PES. On the other hand, one also has to recognize that 
the construction of such redundant symmetry functions 
is necessarily system-specific, and the benefit of includ- 
ing them in the set in terms of lowering the fitting error 
can only be assessed by trial- and-error. 

To provide an example for such redundant symmetry 
functions, we will discuss below in the application to the 
dissociation of O2 at Al(lll) that we found it useful to 
include a term depending only on the molecule-surface 
separation 



G^ = e-^^ , (14) 



as well as three Fourier terms depending on the center 
of mass of the molecule in the same way as the atomic 
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X (a) X (a) X (a) 

FIG. 5: (Color online) Plot of the six symmetrized and anti-symmetrized Fourier terms of Eq. (|13p as a function of the X and 
Y center of mass coordinates of the molecule in units of the in-plane lattice constant a. The positions of the top layer surface 
atoms are marked by the white circles. In all plots the molecule has a distance of 2.1 A from the surface, a bond length of 
1.3 A and an angular orientation oi 6 = 90° and cj> = 30°. The absolute function values have no meaning, only the correct 
symmetry is required. 
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These terms do not break the symmetry of the functions 
G^ — G^ (or G'-^ — G'-"^) and are motivated by the fact that 
for a small almost spherical diatomic molecule the center 
of mass position will have a pronounced influence on the 
energy expression. Accordingly, although Eqs. (jlSHlTp 
contain redundant information, this does not complicate 
the fit, but instead "assists" the NN in extracting the 
energetically relevant structural information. 



E. Calculation of forces 

In order to perform molecular dynamics simulations 
the forces acting on the atoms are required. The force 
Fa acting in the direction of the molecular coordinate a 
can be obtained analytically from the neural network by 



_dE_ _ as ac'^ 



dGt' da 



(18) 
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FIG. 6: (Color online) Values of the synnnetry functions — as defined in Eq. (|13p for a molecular rotation about <jf> from 
0° to 360° , with cj> being the angle between the positive x axis and the projection of the molecular axis into the xy-p\axie. The 
insets show top views of the molecule above the surface, illustrating the lateral position and angular orientation. In (a) the 
molecule is parallel to the surface (6=90°) above a fee site. In (b) the molecule is in a fee site with an angle of 6=30° to the 
surface normal. In (c) the molecule is above a bridge site with 6=30°. In (d) the molecule is located at a low symmetry site 
(X=0.5a and Y—jg\/Za}j with 6=30°. For plotting, the function values have been rescaled and shifted since the absolute values 
have no meaning. 



The derivative of the total energy with respect to the 
symmetry functions, i.e., the input of the NN, is de- 
termined by the network structure. The derivative of 
the symmetry functions with respect to the molecular 
coordinates is given by the definitions of the symmetry 
functions, which thus need to have continuous deriva- 
tives. Both can be evaluated analytically, and using these 
analytic derivatives it is guaranteed that the forces are 
consistent with the energies, and in particular, that the 
forces are zero at local minima of the potential-energy 
surface. We found that the forces obtained from the neu- 
ral network are much more precise with respect to the 
symmetry of the problem than forces directly obtained 
from DFT, since in the latter case even in highly con- 
verged calculations the symmetry of the forces is often 
broken by numerical noise. 



IV. APPLICATION TO O2 DISSOCIATION AT 
AL(lll) 

We illustrate the use of the developed symmetry func- 
tions in the application to the O2 dissociation at an 
Al(lll) surface. Here, the six-dimensional PES repre- 



senting the interaction of an O2 molecule constrained to 
its spin-triplet state has been of particular interest to 
explain the experimentally measured low sticking coeffi- 
cient. For further details on the physics of this system, 
we refer to Refs. [2^ and [2^. Here this particular PES is 
simply taken as an example to illustrate the NN interpo- 
lation scheme. 



A. Mapping of the six-dimensional PES 

As input data to the NN, the six-dimensional PES has 
been mapped using density-functional theory as imple- 
mented in the DMoP code^i^^ and employing the RPBE 
functionaP^ to describe electronic exchange and correla- 
tion. The Kohn-Sham orbitals are expanded in a basis of 
numerical atomic orbitals and polarization functions^, 
and the spatial extent of the orbitals is confined by a 
cutoff of 9 Bohr. A mesh of (3x3x1) k-points has been 
used to sample the Brillouin zone. A Fermi broadening 
of 0.1 eV has been applied to improve convergence and 
the energy was subsequently extrapolated to K. The 
spin-triplet on the oxygen molecule is enforced by em- 
ploying a locally-constrained DFT approach described in 
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detail in Refs. A total of 3768 DFT data points 

has been calculated, mostly contained in 38 different, so- 
called "elbow-plots" , which give the PES as a function 
of the molecular bond length r and distance to surface 
Z for fixed molecular orientation and surface site. The 
energy zero for the PES has been defined for an infinitely 
large molecule-surface separation, i.e., as the sum of the 
total energy of the clean Al(lll) surface and a free O2 
molecule at its equilibrium bond length. 

A detailed account of the DFT calculations and selec- 
tion of the data points can be found in Ref. [H. Here, we 
only mention one aspect that was found to be of impor- 
tance for the NN fit. For small molecular bond-lengths 
and small molecule-surface separations the potential be- 
comes highly repulsive. This steep rise in the energy (and 
consequently larger energy range to be fitted) caused 
problems to the achieved accuracy of the NN fit. How- 
ever, these high-energy regions are not accessible in MD 
simulations which typically focus on a maximum molec- 
ular kinetic energy of 1 eV, and consequently a highly 
accurate mapping of these parts of the PES is not re- 
quired. We thus applied a cutoff function to constrain 
the highest potential energies to an energy threshold of 
Et = 5 eV using 

E-l ^ ^^^'-^^(19) 

) Et- exp (Et -AE-E) for E > Et - AE^ ' 

where AE has been set to 1 eV. The potential ener- 
gies up to +4, eV are thereby unmodified, while only the 
higher energies are quenched to approach Et asymptoti- 
cally. Constraining the highest energy of the PES in this 
way reduces the energy range to be fitted, and was found 
to yield more accurate NN fits in the remaining (relevant) 
energy range. 

B. NN optimization procedure 

From the total of 3768 DFT energy points, 96 ran- 
domly chosen points were used as independent test set 
not included in the NN optimization procedure. Differ- 
ent "fitting weights" (not to be confused with the weight 
parameters connecting the NN nodes) were assigned 
to the remaining DFT data points used to train the NN. 
The motivation for this was the requirement that the PES 
representation needs to be most accurate for the low- 
energy regions along the minimum energy paths. DFT 
data points corresponding to these regions were thus em- 
phasized in the fit by higher fitting weights, so as to en- 
force a most accurate reproduction by the optimized NN. 
After several tests, the fitting weights compiled in Table|T] 
were chosen for the NN optimization. 

In order to demonstrate the role of the symmetry func- 
tions, we proceed in two steps. First, fits using only func- 
tions — are constructed, which already contain all 
relevant structural information. In the next step fits us- 
ing all 11 symmetry functions G^ — G^^ are constructed 



and the effect of the additional, redundant functions on 
the numerical accuracy of the results is investigated. Two 
aspects of the resulting PES representations are of inter- 
est: First, the PES should be qualitatively correct, i.e., 
all symmetry features of the system must be present. Sec- 
ond, also the quantitative accuracy of the energies should 
be as close as possible to the underlying DFT data. 

All in all 64 different NN architectures with varying 
numbers of hidden layers, nodes per layer and parame- 
ters of the Kalman filter^ft have been tested. The best 
fit was obtained using two hidden layers with 40 nodes 
per layer, the hyperbolic tangent as activation function 
in the hidden layers and a linear activation function in 
the output layer. Using symmetry functions G^ — G*" 
only, i.e. the minimum structural information, the root 
mean square errors (RMSE) of the training and the test 
sets are 0.472 eV and 0.329 eV. The mean average de- 
viations (MAD) are 0.166 eV and 0.196 eV. Due to the 
higher fitting weights assigned to the low energy points 
(cf. Table |T| the accuracy of the fit is much better for 
these DFT data points, which represent the most impor- 
tant PES regions for the MD apphcations. The MAD 
for the points with E < 1.0 eV is only 0.070 eV and the 
points along the entrance channel having the O2 equi- 
librium bond length of r = 1.22 A have a particularly 
small error of only 0.019 eV, which is very important for 
an accurate description of steering effects acting on slow 
molecules approaching the surface. 

Employing all symmetry functions G^ — G^^ the RMSE 
of the full training and the test sets are reduced by one or- 
der of magnitude, to 0.049 eV and 0.070 eV, respectively. 
The MADs are 0.023 eV and 0.033 eV. This indicates 
that including the functions describing the center of mass 
of the O2 molecule strongly supports the fitting process 
for the nearly spherical O2 molecule. The fitting error 
for the important low-energy points with E < 1.0 eV is 
now reduced to 0.012 eV and the points along the en- 
trance channel having the O2 equilibrium bond length 
of r = 1.22 A have a very small error of only 1.4 meV. 
This excellent overall reproduction of the PES by the op- 
timized NN can also be seen by the three sample "elbow- 
plots" shown in Fig. [T] However, the inspection of two- 
dimensional cuts through the 6-dimensional PES is not 
sufficient to ensure an accurate representation of the PES 
in all dimensions. We therefore confirmed the accuracy 
of the NN PES by comparing several MD trajectories run 
on the NN PES with corresponding direct on-the-fly ab 
initio MD trajectories. After 0.2 ps, which is about the 
time required by a molecule with a translational kinetic 
energy of 0.15 eV starting at Z~5 A to approach the 
barrier region at Z=2.5 A, the positions in the NN PES 
MD and on-the-fiy ab initio MD differed in all cases by 
less than 0.1 A. 

Apart from this quantitative assessment of the fitting 
accuracy, the central aim of the present work is to pro- 
vide a scheme to construct NN potentials with the correct 
symmetry properties. In Fig. [5] the potential obtained 
using functions G^ — G^^ is plotted for the molecular ro- 
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Characterization of data points 


Weight 


E <1 eV 


2.0 


d = 1.224 A A (2.8 A < Z < 4.8 A) 


1000.0 


d = 1.224 A A Z > 4.8 A 


5000.0 


(1.21 A < d < 1.49 A) A (1.5 A < Z < 2.6 A) A (-1.0 eV < _B < +1.0 eV) 


11.0 


d = 1.3 A A Z = 2.1 A 


1067.0 



TABLE L Set of fitting weights assigned to the DFT data points used in the NN optimization. The large weights enforce a 
proper reproduction of the most relevant regions of the PES close to the minimum energy paths, which are characterized by 
low energies or bond lengths around the optimized free O2 bond length for large surface separations. 



(a) (b) (c) 




(d) (e) (t) 




r (A) r {A} r (A) 

FIG. 7: (Color online) Two-dimensional cuts ("elbow plots") through the six-dimensional spin-triplet potential-energy surface 
for the oxygen dissociation at the Al(lll) surface. The energy is shown as a function of the center-of-mass distance of the 
molecule from the surface Z and the O2 bond length r. In (a), (b) and (c) the elbow plots as obtained with DFT (white data 
points) and splined within the two dimensions are shown for the three difi'erent molecular orientations described in the insets. 
In (d), (e) and (f) the corresponding elbow plots obtained from the optimized NN potential are shown. Contour lines indicate 
energy differences of 0.2 eV. 



tations described in Fig.[Sl Clearly, the potential exhibits 
the correct symmetry properties of the different surface 
sites. 



V. DISCUSSION 

The excellent representation achieved in the model 
case 02/Al(lll) demonstrates that the proposed 
symmetry-adapted NNs are very well capable of accu- 
rately interpolating a given set of input PES data points. 
We acknowledge that mapping the molecular coordinates 



on slightly involved symmetry functions, as well as evalu- 
ating the NN output and its derivative on-the-fly in a MD 
simulation is computationally slightly more demanding 
than employing simple analytical potentials. Neverthe- 
less, evaluation of the NN PES is still about 5-6 orders 
of magnitude faster than direct DFT-based MD simula- 
tions, and is less susceptible to PES representation errors 
than the less flexible analytical potentials. 

A clear disadvantage of the present NN interpolation 
scheme is that consideration of further degrees of freedom 
requires a completely new NN optimization. Whereas 
it is for example straightforward to consider substrate 
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FIG. 8: (Color online) Symmetry of the NN potential-energy 
for a molecular rotation about (f> at different surface sites cor- 
responding to the plots of the symmetry functions in Fig. [6] 
The molecule above the fee site with 6 — 90° shows a six-fold 
symmetry like the set of symmetry function values in Fig. [6^. 
For 6 — 30° a three-fold symmetry is present like in Fig. |6)d, 
while above the bridge site there is the symmetry of Fig. |6j;. 
At the off-symmetry site there is no symmetry like in Fig. |6ji. 

motion in on-the-fly ab initio MD simulations, this re- 
quires a mapping and interpolation of a correspondingly 
higher dimensional PES in the NN scheme. One of 
us is presently developing a new NN approach to this 
problem^^, but the limited dimensionality that can be 
well treated and the ensuing limitation to the frozen- 
surface approximation are certainly the biggest draw- 
backs of the NN scheme as presented here. In this re- 
spect, it is only a weak justification, that consideration 
of only the molecular degrees of freedom has proven to be 
a valid approximation for a number of adsorbate systems. 

With this in mind, the systematic construction of sym- 
metry functions based on atomic Fourier terms does im- 
prove on present NN schemes and replaces the cumber- 
some empirical construction of complex symmetry func- 
tions depending simultaneously on many degrees of free- 
dom. The correct choice of symmetry functions is essen- 
tial for an accurate representation of the PES. Inaccurate 
symmetry functions can treat inequivalent configurations 
as equivalent, possibly facing the NN with the task to 
fit two different input energies to "nominally" the same 
structure. Such contradictions typically give rise to bad 
fits. Vice versa, truly equivalent structures should also 
yield the same set of symmetry function values, since only 
then the symmetry of the surface is exactly included in 
the PES representation (and in the forces) , and a numer- 
ical symmetry breaking affecting the MD trajectories is 
impossible. Both requirements are met by the current ap- 
proach. It is also important to repeat that the mapping 
of the coordinates is always done only in one direction: 
From the six molecular coordinates to the 11 symmetry 
functions for the training of the NN, as well as for the 
prediction of energies and forces of new structures. It 



is not necessary to reconstruct the original set of coor- 
dinates from the set of symmetry function values {G^}, 
which also allows us to use rather complicated symmetry 
functions that are hard to invert. 

An application of the here described Fourier method to 
other surface unit-cell shapes and sizes, i.e. other single- 
crystal surfaces, should in most cases be straightforward. 
For rectangular surface unit-cells, e.g. at fcc(lOO) sur- 
faces, the centers of three atomic Fourier terms can e.g. 
be placed at the top, bridge and hollow site to span the 
symmetry unique wedge of the surface. Also larger sur- 
face supercells may readily be used, e.g. in the presence 
of preadsorbed atoms or surface reconstructions, if the 
symmetry unique wedge of the surface is chosen accord- 
ingly. Finally, in principle it is also possible to describe 
the simultaneous interaction of two or more diatomic 
molecules with the surface. In this case a set of symme- 
try functions is constructed for each of the two molecules 
and augmented by terms describing the relative position 
of the molecules. However, in this well as in the 

case of poly-atomic molecules, the configuration space, 
i.e. the dimensionality of the problem, is strongly in- 
creased making a systematic mapping of the PES more 
costly. This is particularly consequential, since it cannot 
be overstressed that NNs are not able to extrapolate the 
energies of structures outside the configuration spanned 
by the input data set. A systematic mapping of the entire 
PES range of interest as done in the 02/Al(lll) model 
case is thus a prerequisite for the presented approach. 



VI. SUMMARY 

We have presented a symmetry-adapted neural net- 
work representation of potential-energy surfaces for 
molecule-surface interactions. The method builds on 
symmetry functions, which fully take the symmetry of 
the surface into account. Instead of the molecular coor- 
dinates, the values of these symmetry functions are used 
as input for the neural network. The symmetry functions 
are constructed in a systematic way from atomic Fourier 
terms and the construction recipe should be readily ap- 
plicable to a wide range of surface unit-cell sizes and 
shapes. The accuracy of the method has been demon- 
strated by interpolating the six-dimensional potential- 
energy surface of an oxygen molecule in the spin-triplet 
state interacting with the Al(lll) surface. 
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